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1. 


Introduction 


Error-tree  direct  methods  for  the  inversion  of  numerical 
and  polynomial  matrices  are  available  in  the  literature  [1] 

[2].  In  this  paper  we  describe  a  parallel  error-free  high- 
order  convergent  matrix  inversion  method  for  matrices  over 
integers,  based  on  the  Newton- Schultz  iterative  scheme  [3] 

[4]  and  the  p-adic  approximation  [5-9].  Some  of  the  impor¬ 
tant  aspects  of  this  scheme  are: 

(i)  Inversion  of  matrices  over  p-adic  fields,  analogously 
to  inverting  or  reciprocating  the  numbers,  without 
any  convergence  problem. 

(ii)  The  exact  and  simultaneous  determination  of  the 

rational  elements  of  the  inverse  matrix  in  p-adic 
digit  parallel  fashion  with  a  quadratic  or  higher  rate, 

(iii)  Easy  realization  of  the  scheme  and  its  variants  (higher- 
order  convergent  extensions)  by  parallel  matrix  multi¬ 
plications  . 

This  paper  is  organized  in  seven  sections.  In  the  second 
section  wje  outline  the  principle  of  the  Newton-Schultz  scheme 
for  reciprocating  numbers.  The  third  section  describes  the 
reformulation  of  the  Newton-Schultz  scheme  in  an  algebraic 
setting  to  compute  the  p-adic  approximant  to  the  inverse  of  a 
matrix  over  the  ring  of  integers.  In  the  fourth  section  we 
describe  the  extended  Euclidean  algorithm  that  converts  a 
given  p-adic  approximant  over  a  range  of  rationals  into  an 
equivalent  rational.  The  fifth  section  contains  an  example. 


-»«*■* 


* 


In  Section  6  we  briefly  deal  with  the  solution  of  a  linear 
system  of  equations,  having  a  linear  convergence  rate. 
Several  remarks  pertaining  to  possible  extensions  and  gen¬ 
eralizations  are  provided  in  the  last  section. 

l 

i 

I 

l 


2 .  The  principle 

Let  f (x)  be  a  real  function  of  the  real  variaole  x  and 
x=a  be  a  root  of  f(x)=0.  We  assume  that: 

(a)  f  (x) ,  f'(x)  and  f"(x)  are  continuous  in  a  neighbor¬ 
hood  [a,b]  of  x=.'i ;  (b)  x=  i  is  an  isolated  root  in 

[a,b];  (c)  f • (x)  and  f"(x)  do  not  vanish  in  fa,b]. 

The  search  for  the  root  x=o  entails  finding  the  root 
of  the  equation 


x  =  x  - 


f  •  (X) 


Mx)  . 


Since  1'  (:O=0  there  exists  a  neighborhood  of  x=ct  such  that 

the  sequence  (x. }  defined  bv 
1  i  0 

Xn  =  xn-i  “  f  (xn-l) //f '  ^n-l^  (n=l,2,...)  (1) 

converges  to  x=a  if  the  first  apDroximation  x=xQ  lies  in  this 

neighborhood.  Applied  to  the  function  f(x)=l/x-a  (1)  gives 

the  Newton-Schultz  scheme 

x  =  x  ..  ( 2-ax  ,  )  (2) 

n  n-1  n-1 

The  sequence  (2)  converges  to  a  ^ .  The  matrix  inversion  algori¬ 
thm  to  be  described  in  the  next  section  is  by  analogy  based  on 


the  sequence  of  iterates  defined  by  (2)  [3]  [4]. 
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3 .  The  Newton- Schultz  method 

Let  A=[a^j]  be  a  matrix  over  the  ring  of  integers  Z  and 
p  a  prime  such  that  det  A  mod  p^O.  (The  reason  for  this  will 
became  clear  later.)  The  algorithm  first  constructs  A  ^  mod  p 
and  using  this  in  the  Newton-Schultz  recurrence  obtains  a  segment¬ 
ed  p-adic  representation  of  the  inverse  matrix  (5-9]. 

Theorem  1 :  There  exists  a  matrix  sequence  sucb  that 

21  — 

Ap. j  mod  p  =  I  for  all  i^O,  where  A  is  the  matrix  to  be  inverted 

and  I  is  the  identity  matrix;  B2i  is  the  inverse  of  A  (in  Z)  mod 

2i  , 

p^  (or  B^i  is  the  p-adic  approximant  of  A  1 ) . 

Proof :  We  shov’  the  sequence  can  be  9enerated  recur¬ 

sively  and  then  prove  by  induction  that  it  has  the  property 

2 ^ 

stated,  namely,  AB0 i  mod  p  =  I.  The  first  member  of  the  sequence 

A. 

is  obtained  in  a  priming  step  by  solving 

AB^  mod  p  =  I 

by  Prussian  elimination  or  some  other  method.  It  amounts  to 
ci nc'.ing  the  inverse  of  A  .in  Z  mod  p.  Then  in  a  powering  step 
'..o  use  the  recurrence  relation 

B2 i  =  B2 i-1  (21  -  AB2i-l)  mod  p2  (i^l)  (3) 

to  construct  the  successive  iterates. 

.  2^ 

To  see  that  the  theorem  holds  let  AB2i  mod  p  =  I  be  true 

for  i  =  n-1 (n^l) ;  then,  by  (3) 

2n  2n 
(AB2n)  mod  p  =  AB2n-l(2l  -  AB2n-l)  mod  p 


-  -r-  . 


Since  AB2n-l  mod  p  =  I  by  the  induction  hypothesis,  we  have 

_n-l 


AB2n-l  =  I  +  P 


Jn-1 ' 


where  E  .  is  the  error  matrix.  Thus  we  can  write 

n-1 

2n  2n  1  2n-l  2  n 

AB2n  mod  p  =  (I  +  p  En-1^  ( 1  ~  P  En-1^  mot^  P 

Since  by  construction  the  theorem  holds  for  n=P,  it  is  true  for 

all  n^O  by  induction. 


Our  algorithm  first  obtains  B2k  by  iterating  k  times,  where 
k  is  the  minimum  integer  satisfying  the  inequality 


Ip  -1 


n  n 


>  n 


i=l  j=l 


a2.)1/2 


This  inequality  ensures  that  the  largest  element  of  the  inverse 
matrix  lies  within  the  range  of  the  segmented  p-adic  representa¬ 
tion  of  the  corresponding  rational  [5]  [8]. 

Let  N  denote  a  positive  integer  satisfying  the  inequality 

N  <  n/eS  1 5 ) 

We  define  a  finite  subset  F  of  the  rational  numbers  Q  as  the  set 

F  =  {a  =  0< I c I <N  and  0<|d|<N} 

N  d  —  - 

We  call  the  set  F  the  order  N  Farey  fractions,  or  simply  Farey 
rationals  of  order  N. 

If  p  and  k  are  properly  chosen  to  satisfy  (4)  then  the 
rationals  F^  which  are  mapped  onto  their  segmented  p-adic 
representations  in  B2k  can  be  uniquely  recovered  using  an  algori¬ 
thm  which  is  based  on  the  extended  Euclidean  algorithm  for  finding 
the  graatest  common  divisor  of  two  integers  [10]  [11] . 


-- r  -  -  •  -* 


Vr  -  c afi&rZ 


Then 


“1  2 

ab  mod  p  =  w  ( 6 ) 

_  I  2^ 

since  b  exists  mod  p  ,  due  to  the  fact  that  det  A  mod  p/0. 

In  the  following  section  we  describe  how  to  recover  a/b 
given  w,  provided  (4)  is  satisfied.  This  algorithm  filters  out 
a  very  small  subset  of  rationals  among  which  the  desired  rational 
belonging  to  occurs.  We  will  call  the  function  that  computes 
a/b  given  w,the  EUCLID;  thus  EUCLID (w)=a/b. 


Remark 


The  number  k  determined  from  (4)  is  generally  larger  than 
desired;  so  to  iterate  k  times  entails  much  superfluous  computa¬ 
tion.  A  practical  method  of  avoiding  this  would  be  to  compute 
EUCLID  (B2k)  and  EUCLID (B2k+1 )  starting  with  some  reasonable  k 
and  stop  as  soon  as  they  are  equal.  This  would  unambiguously 
determine  the  inverse. 


4. 


The  Euclidean  algorithm  [11]  constructs  three  pairs  of  num¬ 
bers  (u.  ,u|)  ,  (a^b^,  (ti,t|)  for  each  i=0 , 1 , 2  ,  .  .  .  ,  k  starting 

with  UQ-pr,UQ=0,  aQ=w,bQ=l  and  ending  when  t^=0,  a&  illustrated 
in  Table  1;  here  the  symbol  [  ]  denotes  the  lower  integral  part 

Note  that  the  q^'s  here  correspond  to  the  continued  fraction 
expansion  [5]  [12]  of  pr/w. 

It  can  easily  be  shown  that  the  pairs  (a^,b^)  in  Table 
satisfy  the  following  conditions  [10]  [11]: 

Cross-product  rule: 

!<V*W  +  !ai+i'bi'  ’  Pr  i  2n2+1 

Monotonicity : 

lai+ll  -  !  a  i  I  '  with  ao==w'ak=1  (8) 

!bi+ll  -  lbil  with  b0==1,bk=w  ^  mod  pr 

IT  —  2. 

where  w  is  such  that  gcd(w,p  )=1  and  w  denotes  the  multipli¬ 
cative  inverse  of  w  mod  pr. 

It  is  now  necessary  to  show  that  (i)  there  exists  a  pair 
(aj,bj)  '*'n  1  which  satisfies  the  condition  of  a  Farey 

rational  'Section  3),  and  (ii)  ,  such  a  pair  is  unique  in  the  sense 
that  there  exists  no  other  pair  belonging  to  FN . 

To  prove  this,  we  use  the  fact  that  ai  (starting  with  aQ=w) 
successively  decreases  to  1;  and  b^  (starting  with  b  =1 )  succes 
sively  increases  to  w  when  gcd(w,p  )  =  1. 

Let  us  assume  that  for  some  j,  bj  has  already  increased  from 
1  to  | N * |  with  |N' |  <  | N |  and  is  close  to  | N [ ,  and  the  correspond¬ 
ing  aj  has  already  decreased  from  w  to  | N” |  where  |N"|  >  |N|  and 


* 


is  close  to  | N | .  Then  using  (7)  we  can  prove  that  the  succeed¬ 
ing  pair  (a.  ,#  b .  .  )  will  have  to  be  in  FXI  or  in  other  words 
3+i  j+i  N 

a  pair  of  the  form  (a.,.,  b .  .  )  with  la.,,!  <  N  and  |b.  ,  I  •  N 

3  +  1  j  +  l  1  j  +1 1  —  1  +  1 

which  skips  a  Farey  rational  belonging  to  F^  cannot  exist. 

For  if  |  a  j  |  _>  N+l  and  |bjj  <_  N  and  |a.+^|  <_  N  and  !b^+^J  ^ 

2 

N+l,  we  have  iaj+i‘bjl  5.  N  .  Using  this  in  (7)  we  obtain 
2 

i a .  - b .  n  [  >  N  +1.  But  we  have  la.  I  >  N+l.  Therefore  lb.  ,  I  < 

d  3+1  —  3  -  '3+1- 

2 

(N  +1 ) / (N+l )  =  [N] .  Hence  our  assumption  jbj+^|  >  N  is  false. 

We  will  now  show  that  there  is  only  one  such  rational  be¬ 
longing  to  F  .  In  other  words,  we  will  show  that  if  for  some  j, 
(a^/bj)  belongs  to  F^  then  (a  j  +  ^/b_. +  1 )  cannot  be  in  F^ .  Note  that 
the  cross-product  is  maximum  when 

I  a 3 |  =  N,  jbjl  =  N  -  1 


aj+1l  “  N-l, 


lbj+1l  *  N. 


In  such  a  case 


laj*bj+1l  +  |b.-aj+1l  =  (N-l)2  +  N2  <  2N2  +  1 

would  still  be  short  of  satisfying  (7).  Notice  that  for  any 
other  choice  of  a  j  ,  b  ^  ,  a..+1,  bj  +  l  the  condition  (7)  would  be 
more  severely  violated.  Also  when  [ a ^  j  =  | b ^  j  =  N,  it  is  not 
possible  for  |a^+1 |  =  N,  since  aj+^  would  become  zero  by  the 
algorithm  in  Table  1. 

Thus  a  p-adic  approximant  (Hensel  Code  [5] )  with  the  weight 
w  corresponds  to  the  rational  aj/bj  belonging  to  F^  and  the 
conversion  is  complete. 

Remarks 

(i)  The  class  of  rationals  generated  by  the  above  algorithm 
may  contain  a  rational  (in  non-reduced  form)  whose  reduced 


form  is  in  F^;  but  this  is  an  invalid  choice.  (See 
example . ) 

£ 

(ii)  If  gcd(w,  p  )/l,  the  factor  is  taken  out  and  the  result 
adjusted  suitably. 

Example 

Let  p=5,  r=4,  and  w=448.  Hence  N_<17.  We  now  show  in 
Table  2  the  computations  corresponding  to  Table  1  of  the  algo¬ 
rithm.  The  Farey  rational  is  11/7  (and  not  5/60). 


5 •  Matrix- inversion  example 


Let  A  = 


Let  p  =  3 


Bn  = 


B2  =- 


1  [A]  !  ,  = 


B8  " 


1 

4920 

2460 


B 


16 


32285040 

16142520 


-1 

2 

1 

2 

2 

1 

0 

1 

2 

0 

7 

8 


0 

4921 

5741 

0 

32285041 

37665881 


2 

4 

-2 


1 

0 

1 

*1  = 

60 

61 

20 

30 

71 

50 

10761680 

26904200 


(mod  3) 


(mod  3  = 


(mod 


,4_ 


(mod  3 


16 


>'ys- 


9) 


01) 


6561) 


=43046721) 


We  find  that 


)  , 


EUCLID(B16) 


1 

0 

1 

3/4 

1/4 

-  1/4 

3/8 

1/8 

-  5/8 

EUCLID ( Bg ) =A 


Note  that  the  inverse  matrix  elements  are  simultaneously  deter¬ 
mined  in  p-udic  digit  parallel  fashion  with  a  quadratic  rate  of 
convergence. 


6. 


Solution  of  a  system  of  linear  equations  by  linear  convergence 
We  now  briefly  consider  the  problem  of  determining  the  solu¬ 


tion  to  a  system  of  linear  equations  iteratively. 

Let  Ax=b  be  a  system  of  linear  equations  such  that  det  A  mod 
p  7*  0,  p  being  a  prime.  Let  A=A^  mod  p  and  b1=b  mod  p.  We  first 
solve  A 1  x^=bjmod  p  by  Gaussian  elimination  (say)  and  there¬ 
after  use  the  iterative  scheme 

x(k+1)  =  (p  A^1  M  x(k)  +  A"1  b)  mod  pk+1 (k=l , 2 , . . . ) 
where  A=A^-p  M  and  M  is  the  error  matrix.  We  can  easily  show  by 
induction  that 


( Ax  ^ k  ^  -  b)  mod  pk  =  0. 

Then,  our  algorithm  is  formally: 

Step  1  Solve  A^  x^=b^  mod  p. 

Step  2  Use  x(k+1)=(p  A"1  M  x(k)  +  A^1  b)  mod  pk+1  to  obtain 
the  next  iterate. 

Vr  W  1 

Step  3  If  EUCLID (x)  =  EUCLID(x  )  stop;  else  go  to  2. 

Remark 

Note  that  this  scheme  for  the  solution  of  linear  equations 
has  only  a  linear  order  convergence.  However,  it  has  the  ad¬ 
vantage  of  using  only  matrix-vector  multiplications  unlike  the 
Newton  iterative  scheme  where  matrix-matrix  multiplications  are 
involved. 


Concluding  remarks 

(i)  The  scheme  ,Qf  formula  (3)  gives  rise  to  quadratic 
convergence.  It  is  possible  to  use  schemes  having 
higher-order  convergence.  The  following  scheme, 
for  example, 

B,n  =  B  n-1  (I  +  (I  -  B,n-1)  (21  -  AB-,n-l))  mod  p' 


,n 


(10) 
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has  cubic  convergence. 

(ii)  We  have  assumed  throughout  that  det  A  mod  p  ^  0,  but  in 
actual  computation  we  cannot  assume  this  a  priori.  We 
can  keep  choosing  one  prime  after  another  until  we  suc¬ 
ceed;  but  this  is  very  expensive  computationally.  It 
would  be  better  to  use  the  method  of  rank  1  update,  which 
is  as  follows: 

We  apply  our  algorithm  to  A+V  instead  of  A  where 

v  =  |a„  I  _  _^t  is  arbitrarily  (11) 

chosen . 


(b1,b2, . . . ,bn]  =  ab 


Finally,  we  use  the  formula 


A'1  =  ( A+V )  ” 1  +  l*tn'i_VIA+vT.l  (1 

1-b  (A+V)  a 

to  retrieve  the  actual  inverse.  This  method  always  suc¬ 
ceeds  except  when  A  ^=0  over  Z  and  A  mod  p=0. 

(iii)  It  is  possible  to  extend  the  scope  of  our  algorithm  for  the 
determination  of  the  g-inverse  of  a  singular  matrix. 

(iv)  The  algorithm  determines  all  the  elements  of  the  inverse 
matrix  simultaneosly  in  p-adic  digit  parallel  fashion  with 
a  quadratic  or  higher-order  convergence  rate  [13] . 
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(v)  In  solving  a  system  of  linear  equations,  we  note  that 


we  have  split  the  matrix  A  in  a  very  special  way,  name¬ 
ly,  A=A^  -  p  M.  We  could  try  splitting  it  as  in  the 
Jacobi,  Gauss-Seidel  or  SOR  method  [31;  but  unfortun¬ 
ately,  the  convergence  in  our  sense  is  not  realizable  in 
these  cases. 

(vi)  We  can  invert  polynomial  matrices  whose  elements  are  in 
z  (2)  by  constructing  the  inverses  of  the  matrices  Z[x] 
mod  p^  for  several  primes  p^  and  then  using  the  Chinese 
Remainder  Theorem  to  construct  the  actual  inverse  [1] . 
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Table  1 

Euclidean  Algorithm 
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Table  2 

Example  of  Euclidean  algorithm 
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